131 research outputs found

    Fully-unsupervised embeddings-based hypernym discovery

    Funding: Supported in part by Sardegna Ricerche project OKgraph (CRP 120) and MIUR PRIN 2017 (2019-2022) project HOPE (High quality Open data Publishing and Enrichment). Peer reviewed.

    Abduction and Anonymity in Data Mining

    This thesis investigates two new research problems that arise in modern data mining: reasoning on data mining results, and the privacy implications of data mining results. Most data mining algorithms rely on inductive techniques, trying to infer information that is generalized from the input data. Very often, however, this inductive step on raw data is not enough to answer the user's questions, and the data must be processed again using other inference methods. To answer high-level user needs such as explanation of results, we describe an environment able to perform abductive (hypothetical) reasoning, since the solutions of such queries can often be seen as the set of hypotheses that satisfy some requirements. By using cost-based abduction, we show how classification algorithms can be boosted by performing abductive reasoning over the data mining results, improving the quality of the output. Another growing research area in data mining is privacy-preserving data mining. Due to the availability of large amounts of data, easily collected and stored via computer systems, new applications are emerging, but privacy concerns can make data mining unsuitable. We study the privacy implications of data mining in a mathematical and logical context, focusing on the anonymity of the people whose data are analyzed. A formal theory of anonymity-preserving data mining is given, together with a number of anonymity-preserving algorithms for pattern mining. The post-processing improvement of data mining results (w.r.t. utility and privacy) is the central focus of the problems investigated in this thesis.

    Semantic wikis as flexible database interfaces for biomedical applications

    Several challenges hinder the extraction of knowledge from biomedical resources, including data heterogeneity and the difficulty medical doctors face in obtaining and collaborating on data and annotations. Flexibility in the representation and interconnection of such data is therefore required, and it is also essential to be able to interact with the data easily. In recent years, semantic tools have been developed: semantic wikis are collections of wiki pages that can be annotated with properties and so combine flexibility and expressiveness, two desirable aspects when modeling databases, especially in the dynamic biomedical domain. However, semantic and collaborative analysis of biomedical data is still an unsolved challenge. The aim of this work is to create a tool that eases the design and setup of semantic databases and makes it possible to enrich them with biostatistical applications. As a side effect, this also makes them reproducible, fostering their adoption by other research groups. A command-line tool has been developed for creating all the structures required by Semantic MediaWiki. In addition, a way to expose statistical analyses as R Shiny applications in the interface is provided, along with a facility to export Prolog predicates for reasoning with external tools. The developed software made it possible to create a set of biomedical databases for the Neuroscience Department of the University of Padova in a more automated way. These databases can be extended with additional qualitative and statistical analyses of data, including for instance regressions, geographical distribution of diseases, and clustering. The software is released as open-source code under the GPL-3 license at https://github.com/mfalda/tsv2swm

    Towards trajectory anonymization: a generalization-based approach

    Trajectory datasets are becoming popular due to the massive usage of GPS and location-based services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We first adapt the notion of k-anonymity to trajectories and propose a novel generalization-based approach for the anonymization of trajectories. We further show that releasing anonymized trajectories may still leak some private information. We therefore propose a randomization-based reconstruction algorithm for releasing anonymized trajectory data, and also present how the underlying techniques can be adapted to other anonymity standards. The experimental results on real and synthetic trajectory datasets show the effectiveness of the proposed techniques.
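
As a rough illustration of what k-anonymity means for generalized trajectories, the sketch below coarsens each point to a grid cell and time bucket, then checks that every coarsened trajectory is shared by at least k inputs. This is not the paper's algorithm: the grid-based generalization, the function names, and the sample coordinates are all assumptions made for illustration.

```python
from collections import Counter


def generalize(trajectory, cell_size, time_step):
    """Coarsen each (x, y, t) point to a grid cell index and a time bucket."""
    return tuple(
        (int(x // cell_size), int(y // cell_size), int(t // time_step))
        for x, y, t in trajectory
    )


def is_k_anonymous(trajectories, k, cell_size, time_step):
    """Return True if every generalized trajectory occurs at least k times."""
    counts = Counter(generalize(tr, cell_size, time_step) for tr in trajectories)
    return all(c >= k for c in counts.values())


# Hypothetical sample data: three short trajectories of (x, y, t) points.
trajectories = [
    [(1.0, 2.0, 0), (11.0, 3.0, 60)],
    [(4.0, 7.0, 10), (14.0, 8.0, 70)],
    [(55.0, 2.0, 0), (65.0, 3.0, 60)],
]

# Coarse cells merge the first two trajectories; fine cells keep them distinct.
coarse_ok = is_k_anonymous(trajectories[:2], k=2, cell_size=10.0, time_step=60)
fine_ok = is_k_anonymous(trajectories[:2], k=2, cell_size=1.0, time_step=60)
# The third trajectory has no companion, so the full set is not 2-anonymous.
all_ok = is_k_anonymous(trajectories, k=2, cell_size=10.0, time_step=60)
```

Coarser cells make it easier to reach k-anonymity but discard more spatial detail, which is the utility trade-off a generalization-based approach has to manage.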

    Exploiting social internet of things features in cognitive radio

    Cognitive radio (CR) is a suitable technological solution when radio resources are scarce and shared channels are available. For the deployment of CR solutions, it is important to implement proper sensing procedures, which are aimed at continuously surveying the status of the channels. However, an accurate view of the resource status can be achieved only through the cooperation of many sensing devices. For these reasons, in this paper we propose the use of the Social Internet of Things (SIoT) paradigm, according to which objects are capable of establishing social relationships autonomously, within the rules set by their owners. The resulting social network enables faster and more trustworthy information and service discovery by exploiting the social network of "friend" objects. We first describe the general approach according to which members of the SIoT collaborate to exchange channel status information. Then, we discuss the main features, i.e., the possibility of implementing a distributed approach for low-complexity cooperation, and scalability in heterogeneous networks. Simulations have also been run to show the advantages in terms of increased capacity and decreased interference probability.

    Towards trajectory anonymization: A generalization-based approach

    Trajectory datasets are becoming popular due to the massive usage of GPS and location-based services. In this paper, we address privacy issues regarding the identification of individuals in static trajectory datasets. We first adopt the notion of k-anonymity to trajectories and propose a novel generalization-based approach for anonymization of trajectories. We further show that releasing anonymized trajectories may still have some privacy leaks. Therefore we propose a randomization-based reconstruction algorithm for releasing anonymized trajectory data and also present how the underlying techniques can be adapted to other anonymity standards. The experimental results on real and synthetic trajectory datasets show the effectiveness of the proposed techniques.

    Dynamic carpooling in urban areas: design and experimentation with a multi-objective route matching algorithm

    This paper focuses on dynamic carpooling services in urban areas to address real-time mobility needs, and makes a two-fold contribution: a solution with novel features with respect to the current state of the art, named CLACSOON and available on the market; and an analysis, through emulations, of the performance of carpooling services in the urban area of the city of Cagliari. Two new features characterize the proposed solution: partial ridesharing, in which riders can walk to reach the driver along his/her route to the destination; and the possibility of sharing a ride after the driver has already set off, by modeling the mobility needed to reach the driver's destination. We also conducted emulations to analyze which features of the population yield better performance as the characteristics of the users change. Compared with current solutions, CLACSOON achieves a decrease in waiting time of around 55% and an increase in the driver and passenger success rates of around 4% and 10%, respectively. Additionally, the proposed features increased the reduction in CO2 emissions by more than 10% with respect to the traditional carpooling service.

    Uplift and magma intrusion at Long Valley caldera from InSAR and gravity measurements

    The Long Valley caldera (California) formed ~760,000 yr ago following the massive eruption of the Bishop Tuff. Postcaldera volcanism in the Long Valley volcanic field includes lava domes as young as 650 yr. The recent geological unrest is characterized by uplift of the resurgent dome in the central section of the caldera (75 cm in the past 33 yr) and earthquake activity followed by periods of relative quiescence. Since the spring of 1998, the caldera has been in a state of low activity. The cause of unrest is still debated, and hypotheses range from hybrid sources (e.g., magma with a high percentage of volatiles) to hydrothermal fluid intrusion. Here, we present observations of surface deformation in the Long Valley region based on differential synthetic aperture radar interferometry (InSAR), leveling, global positioning system (GPS), two-color electronic distance meter (EDM), and microgravity data. Thanks to the joint application of InSAR and microgravity data, we are able to unambiguously determine that magma is the cause of unrest.

    Multidetector CT urography: diagnostic role in the evaluation of patients with non-traumatic hematuria

    Hematuria can originate from any part of the urinary tract and may be the only sign of neoplastic disease (kidney or bladder cancer). The literature therefore recommends careful clinical and instrumental evaluation of all cases of hematuria, both macroscopic and microscopic. The aim of this contribution is to define the diagnostic role of multidetector CT urography (uTC-MD) in the evaluation of this symptom and to analyze its impact on patient management, through the study of 181 consecutive patients evaluated for macro- and microhematuria between January 2003 and March 2006.

    Metabolic profile of patients with severe endometriosis: a prospective experimental study

    Endometriosis is a common disease affecting women of reproductive age. There are several hypotheses on the pathogenesis of this disease. Often, its lesions and symptoms overlap with those of many other medical and surgical conditions, causing a delay in diagnosis. Metabolomics is a useful diagnostic tool for studying metabolic changes across different physiological or pathological states. We used 1H-NMR to explore metabolic alterations in a cohort of patients with endometriosis in order to contribute to a better understanding of the pathophysiology of the disease and to suggest new useful biomarkers. Thirty-seven patients were recruited for the metabolomic analysis: 22 patients affected by symptomatic endometriosis and 15 not affected by it. Their serum samples were collected and analyzed with 1H-NMR. Multivariate statistical analysis was conducted, followed by univariate and pathway analyses. Partial Least Squares Discriminant Analysis (PLS-DA) was performed to determine the presence of any differences between the non-endometriosis and endometriosis samples (R2X = 0.596, R2Y = 0.713, Q2 = 0.635, and p < 0.0001). β-hydroxybutyric acid and glutamine were significantly increased, whereas tryptophan was significantly decreased in the endometriosis patients. ROC curves were built to test the diagnostic power of the metabolites (β-hydroxybutyric acid: AUC = 0.85, CI = 0.71–0.99; glutamine: AUC = 0.83, CI = 0.68–0.98; tryptophan: AUC = 0.75, CI = 0.54–0.95; β-hydroxybutyric acid + glutamine + tryptophan: AUC = 0.92, CI = 0.81–1). The metabolomic approach enabled the identification of several metabolic alterations occurring in women with endometriosis. These findings may provide new bases for a better understanding of the pathophysiological mechanisms of the disease and for the discovery of new biomarkers. Trial registration number NCT0233781
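
For readers unfamiliar with the AUC figures quoted above, the sketch below shows one standard way to compute the area under a ROC curve: as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as half). The function name and the sample scores are hypothetical and are not data from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical marker levels for affected (pos) and unaffected (neg) samples.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
auc = roc_auc(pos, neg)  # 8 of the 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported values should be read.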